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Abstract 

Blurring  by  Gaussian  convolution,  or  by  a  Laplacian  of  a  Gaussian  kernel,  is  a  com- 
mon image  processing  technique  used  for  edge  detection,  multiresolution  represen- 
tation, motion  analysis,  matching,  and  other  early  visual  processing  tasks.  We  con- 
sider methods  for  computing  Gaussian  convolutions  on  discrete  grids,  and  also  con- 
sider difference-of-Gaussian  and  Laplacian-of-Gaussian  convolutions.  To  properly 
approximate  the  continuous  theory,  it  is  important  to  evaluate  the  elements  in  these 
large  kernels  by  integration  of  the  continuous  kernel  in  a  region  about  the 
corresponding  location.  We  discuss  complexities  of  implementations  of  Gaussian 
and  Laplacian-of-Gaussian  kernels  on  different  serial  and  parallel  computer  archi- 
tectures. We  further  consider  zero-crossings  of  filtered  images,  and  look  at 
methods  for  selecting  significant  zero-crossing  curves.  Finally,  we  discuss  how  to 
handle  borders. 


1.  Continuous  Convolution 

Many  image  processing  operations  rely  on  convolutions  against  the  image 
intensity  data.  While  many  operators  are  local,  and  can  be  computed  using  convolu- 
tions with  three-by-three  pixel  kernels,  there  are  some  operators  that  require  much 
larger  convolution  kernels.  For  example,  the  Marr-Hildreth  edge  operator  [1] 
requires  convolution  kernels  that  are  sometimes  as  large  as  fifty  by  fifty  pixels  in 
extent.  Most  of  the  large  kernel  filters  are  based  on  Gaussian  blurring  of  the  image 
data  either  to  compute  a  blurred,  smoothed  version  of  the  image  data,  or  to  com- 
pute a  difference-of-Gaussian  image,  or  a  Laplacian-of-Gaussian  image. 

We  will  give  a  number  of  computational  techniques  needed  for  efficient,  accu- 
rate digital  implementation  of  these  kernels  and  interpretation  of  the  results.  We 
will  also  analyze  the  computational  complexity  of  these  implementations,  using 
several  models  of  computation.  Specif icaUy,  we  consider  a  sequential  random 
access  computer,  a  pipeline  parallel  machine  with  a  finite  but  large  word  size,  and  a 
bit-serial  mesh-connected  parallel  computer.  Table  1  shows  the  models  of  computa- 
tion, properties,  and  types  of  machines  that  we  have  in  mind.  Many  of  the  compu- 
tational methods  discussed  here  are  well-known,  and  we  are  mainly  concerned  with 
bringing  together  a  compendium  of  observations  in  implementing  large  kernels 
based  on  Gaussians,  and  comparing  their  complexities  on  different  architectures.  A 
more  thorough  treatment  of  large  kernel  convolutions,  together  with  many  further 
results,  can  be  found  in  [2].    A  recent  article  by  Huertas  and  Medioni  [3]  also 
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Model 

Dcstriptioa 

Unit-time  Operations 

Examples 

SRAM 

Sequential  random 
access  usar.Iii!.?;     ~ 
uniprocessor 

Add.multiply  .fetch, 
store  of  words  (large 
word  size). 

VAX,  PC 

Pipeline 

Image  processing 
pipeline  machine; 
raster-scan  machine; 
vector  machine. 

Table-look-up;  add  of 
two  images, multiply  of 
image  by  constant, 
scroll  (sometimes  zero-cost). 

VICOM 
DeAnza, 
CRAY  XMP 

MCC 

Bit-serial  mesh- 
connected  computer 

Sum  or  multiply  of  two  1-bit 
images, access  neighbor  bit, 
1-bit  image  shift. 

MPP,  GAPP 

Table  1.    Three  models  of  computation  based  on  three  different  computer  architec- 
tures. 


contains  many  relevant  suggestions  for  implementation  of  the  types  of  kernels  dis- 
cussed here. 

A  Gaussian  convolution  is  defined,  in  two  continuous  variables,  by 
hix,y)  =  f  G(x-x\y-y')f(x'  ,y')dx'dy' , 

]R2 


where 


G(x,y)  = 


1 


(*2+y2)/2(T2 


lira^ 


Here  f{x,y)  is  the  input  image,  and  h{x,y)  is  the  desired  filtered  image.    In  this 
paper,  we  address  the  following  questions: 

(1)  How  can  the  computation  be  discretized? 

(2)  How  can  the  discretized  equation  be  computed  efficiently? 

(3)  How  should  the  borders  be  handled?  That  is,  since  f{x,y)  is  not  necessarily 
defined  over  all  of  R^,  the  convolution  must  be  converted  to  a  finite 
domain. 

The  need  for  discretization  is  imposed  since  the  input  function  is  in  fact  given 
at  a  discrete  grid  of  points,  say  f(ij),  where  i  and  ;  take  on  integer  values.  The 
temptation  is  to  convert  the  integral  into  a  sum,  replacing  the  integrand  by  point 
evaluations  at  the  grid  points,  i.e., 

HiJ)  ~  ^^G(i-i',j-j')f{i'J')    . 
i'  j' 

This  can  lead  to  inaccuracies,  however. 
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Instead,  for  all  convolutions,  including  Gaussian  convolution,  the  integration 
should  be  discretized  by  modeling  the  continuous  f{x,y)  from  the  given  discrete 
data /(/,;■).  The  easiest  method  is  to  assume  that/(jc.y)  is  obtained  from  /(/,;)  by 
nearest-neighbor  interpolation.  Then  the  continuous; integral  can  be  computed  from 
the  discrete  sum  in  the  form:  ai;.hu£.!  '' s 

«■'  ;•'  . ]£i...  ■' 

where  isi^?^' 

—  -i+l/2      ./+l/2^-'     •'      ,     , 

^(''^■)=j;-m  ^-1/2  ^^^'y^^'^y- 

Thus  the  discrete  kernel  elements  should  be  obtained  from  block  averages  of  the 
continuous  kernel,  and  not  by  point  evaluations. 

If  bilinear  interpolation  is  used  to  model'  f{x,y)  instead  of  nearest  neighbor 
interpolation,  the  formula  becomes:       — — 

G{i,j)  =  S'^l   Sp^Gix,y)-(l-  \i-x  |)-(1-  \j-y  \)dxdy. 

Higher  order  interpolants  can  be  treated  similarly.  In  general,  the  discrete  kernel 
elements  are  obtained  by  integrating  the  continuous  kernel  against  the  basis  func- 
tions of  the  interpolation  method. 

The  result  of  the  block  averaging  procedure  is  that  the  kernel  values  can  be 
changed  somewhat.  Figure  1  shows  a  one-dimensional  Gaussian  curve  with  dots  at 
the  integer  points,  superimposed  on  a  bar  graph  representing  the  block  average 
values.  The  bar  graph  gives,  in  a  sense,  a  more  accurate  discrete  Gaussian  convolu- 
tion. Further,  the  (infinite)  sum  of  the  averaged  values  G(iJ)  are  guaranteed  to 
sum  to  one.  Interestingly,  the  infinite  sum  of  the  evaluates  of  the  Gaussian  G(iJ) 
also  very  nearly  sum  to  one,  as  long  as  a  is  larger  than  about  0.5  pixels.  This 
occurs  because  the  errors  tend  to  cancel,  due  to  the  fact  that  the  Gaussian  has  a 
region  of  negative  curvature,  and  another  region  with  positive  curvature.  The  first 
row  of  Table  2  gives  the  sum  of  the  kernel  values  for  different  values  of  o-  for  a 
two-dimensional  Gaussian  kernel.  However,  the  fact  that  the  point  evaluates  of  the 
Gaussian  very  nearly  sum  to  one  doesn't  justify  their  use  for  Gaussian  convolution 
—  the  averaged  values  G  form  a  better  approximation  to  the  continuous  theory. 

Indeed,  assume  that  the  image  data  is  bounded,  say  \f(x_,y)\^M.  We  have 
denoted  the  convolution  using  block  average  kernel  values  G(i,j)  by  h{ij).  If 
point  evaluates  G(iJ)  are  used  instead,  let  us  denote  the  result  by  hp{i,j): 

hpdJ)  =  ^J,G{i-i',j-j')f{i'J')  . 
i'  j' 

We  can  then  bound  the  difference: 

\h{i,j)-hpii,j)\  ^Si:lG(''J')  -  G{i',j')\-M  . 
i'  y 

The  bound  can  be  attained,  and  thus  the  sum  of  absolute  differences  of  the  kernel 
elements, 
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Fignre  1.  A  one-dimensional  Gaussian,  with  a=1.0.  Dots  mark  point  evaluates  of 
the  Gaussian  at  each  integer.  The  heights  of  the  bars  in  the  bar  graph  give  the  mean 
values  of  the  Gaussian  in  a  block  of  width  1  centered  at  each  integer. 


Sum 

<T 

.50 

.60 

.75 

1.0 

1.5 

2.0 

3.0 

4.0 

5.0 

2S<5(«J) 

'  J 

1.02897 

1.00328 

1.00006 

1.0 

1.0 

1.0 

1.0 

1.0 

1.0 

22IG(U)-G(u)| 

0.31213 

0.17204 

0.09651 

0.058 

0.026 

0.015 

0.0068 

0.0038 

0.0024 

'  J 

-1.15203 

-0.12971 

-0.00238 

~  0.0 

~  0.0 

-0.0 

~  0.0 

~  0.0 

-  0.0 

22lAG(iJ)-AG(i,;)| 

3.84728 

1.86594 

1.17351 

0.768 

0.414 

0.225 

0.1039 

0.0593 

0.0382 

Table  2.   The  sum  of  point-evaluate  kernel  elements,  where  G  (,x,y)  is  a  Gaussian  in 
two  variables  with  standard  deviation  a  and  total  integral  1. 
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measures  a  proportionality  factor  for  the  maximum  error  in  using  point  evaluates. 
These  values  are  listed  in  the  second  row  of  Table  2  for  various  values  of  ct. 

The  need  for  averaging  the  continuous  kernel  rather  than  point  evaluations  is 
more  critical  when  Laplacian-of-Gaussian  convolutions  are  attempted.  We  will 
shortly  discuss  alternate  methods  of  performing  these  convolutions,  but  sometimes  it 
is  desired  to  perform  a  full  image  convolution  against  the  analytically  calculated  ker- 
nel.  In  the  case  of  Laplacian-of-Gaussian  convolution,  the  kernel  is  given  by:^ 


AG(x,y)  =  C 


x^+y'^   _ 


.-{x^+y^)/2<T^ 


Here  C  is  a  constant,  which  if  a  precise  Laplacian-of-Gaussian  is  desired  is  equal  to 
l/(2ira'*).  Once  again,  point  evaluations  of  the  kernel  lead  to  different  values  than 
block  averages.  Figure  2  depicts  the  one-dimensional  situation  with  the  Laplacian  of 
the  Gaussian,  with  o-=1.0  .  This  time,  kernel  elements  should  sum  to  zero.  Once 
again,  the  block-averaged  kernel  elements,  (AG)ii,j),  are  guaranteed  to  sum  to 
zero,  while  the  point  evaluates  AG(i,j)  serendipitously  very  nearly  sum  to  zero,  as 
long  as  CT  is  sufficiently  large  (greater  than  1.0  suffices).  The  third  row  of  entries  in 
Table  2  gives  the  values  for  the  infinite  sum  of  kernel  elements.  The  error  bound 
for  using  point  evaluates  of  the  Laplacian-of-Gaussian  as  opposed  to  block  averages 
can  be  measured  by  the  sum  of  absolute  differences: 

»■  j 
These  values  are  listed  in  the  fourth  row  of  Table  2  for  varying  values  of  ct.   We  see 
that  point  evaluates  can  lead  to  rather  large  errors,  for  example,  as  high  as  10% 
relative  error  for  a  =  3  pixels. 

Analytic  evaluation  of  the  averages  of  the  kernels  is  possible  in  terms  of  the 
error  function  erf.   For  example,  block  averages  of  the  Gaussian  are  given  by: 

GiiJ)  =  / 


4 


erf 


i-l/2 
-1/2 


1-7  +  1/2 

f:_^^^  G{x,y)dxdy  = 


erf 


J +  1/2 


[  V2  0 


-erf 


J- 1/2 


V2(T 


where   erf  is   the   error   function    associated   with   the   normal   distribution,    i.e., 
erf(x)  =  J^(2/\^)e~'^dt.    (Erf  is  a  built-in  function  on  most  Fortran  compilers). 

0 
Similarly,  block  averages  of  the  Laplacian-of-Gaussian  kernel  are  given  by 


We  use  the  notation  "  A  "  to  denote  the  Lapladan  operator,  as  is  common  in  mathematical  analysis.  In 
physics,  the  notation  "  V^  "  is  often  used  for  the  Laplacian  operator  in  TR}  (three-dimensional  space),  as  a 
shorthand  for  the  divergence  of  the  gradient.  The  notation  is  explained  by  the  fact  that  the  gradient  of  a  func- 
tion/is generally  denoted  by  V/,  and  the  divergence  of  the  result  is  denoted  V-V/.  The  notation  "  V^  "  has  been 
borrowed  and  applied  in  IR"  for  more  general  n  by  many  disciplines. 
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Figure  2.  The  Laplacian  of  a  Gaussian  in  one  dimension,  i.e.,  the  second  derivative 
of  a  Gaussian,  with  a  =  1.0.  The  point  evaluates  and  mean  values  at  each  integer  are 
shown  as  in  Figure  1. 


^[^'(,  +  1/2)-, '0-1/2)]    erf(^ 


-  erf 


7-1/2 


where  ^'U)  =  {-xl^/l^a^)e-'^''^''^ . 


For  more  general  continuous  kernels,  analytic  evaluation  may  be  impossible  or 
simply  too  much  of  a  bother.  The  averaged  kernel  can  then  be  computed  (off-line) 
by  numerical  quadrature  methods,  using  standard  numerical  integration  package 
software.  Simpson's  method  for  one-dimensional  integration  usually  suffices  for  the 
kernels  under  consideration.   More  sophisticated  techniques  can  also  be  used. 

A  separate  issue  when  implementing  these  convolutions  concerns  quantization 
and  truncation  of  the  kernel.  In  practice,  the  kernel  elements  are  typically  approxi- 
mated by  rational  numbers,  so  that  integer  arithmetic  may  be  used.  Depending  on 
the  number  of  bits  used  to  represent  the  kernel  elements,  the  kernel  will  be  effec- 
tively truncated  to  a  finite  domain,  since  elements  outside  of  some  radius  will  be 
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approximated  by  zero.  Further  truncation  is  also  possible.  When  using  these  ker- 
nels, errors  come  from  two  separate  sources:  from  the  quantization  error  due  to  the 
approximation  of  the  kernel  elements  by  rational  values,  and  the  truncation  error 
due  to  the  omission  of  kernel  elements  outside  of  a  given  radius.  The  first  error  is 
controlled  by  the  number  of  bits  used  in  the  representation  of  the  kernel  elements. 
The  truncation  error  is  dependent  on  the  radius,  u&Sd  ,for  the  implementation  of  the 
kernel.  For  bounded  image  data,  the  truncation  jerror.is  bounded  by  an  amount  pro- 
portional to  the  integral  of  the  kernel  outside  of  the  truncation  radius. 

Hildreth  notes  that  the  central  negative  regionin  AG(x,y)  has  radius  w  =  \/2ct, 
and  then  suggests  that  the  Laplacian-of-Gaussian  kernel  be  implemented  with  a 
radius  of  4^=5.660-  [4].  In  Table  3,  we  show  the  integrals  of  the  continuous  ker- 
nels G(x,y)  and  AG(x,y)  to  various  truncation  radii  t<T.  (Actually,  the  table  gives 
the  integrals  of  the  kernels  in  a  box  of  size  2t<T  by  Ita) .  We  also  give  the  magni- 
tude of  the  kernel  values  at  the  outer  radius  location.  It  should  be  noted  that  this 
table  is  independent  of  a.  The  differences  between  the  integral  values  over  the  fin- 
ite domain  and  the  integrals  over  the  infinite  domain  give  estimates  of  the  truncation 
error.  The  quantization  error  is  harder  to  estimate,  but  can  be  made  small  by  using 
many  bits  in  the  representation  of  the  kernel.  The  magnitude  of  the  kernel  at  the 
truncation  error  gives  an  indication  of  the  minimum  number  of  bits  required.  For 
convolution  against  a  Gaussian,  a  kernel  size  out  to  a  radius  of  3.5ct  yields  an  accu- 
racy to  within  .001  times  the  image  data  bound,  and  will  require  kernel  elements 
with  12  bits.  To  achieve  a  .0001  accuracy,  a  radius  of  4.1a  is  needed,  and  kernel 
elements  will  require  at  least  16  bits.  For  convolution  against  the  Laplacian-of- 
Gaussian  kernel,  .001  accuracy  is  achieved  with  a  4.2o-  truncation  radius,  requiring 
12  bit  kernel  elements,  while  .0001  accuracy  yields  a  4.8ct  radius  and  16  bits.  At  the 
radius  5.66ct,  the  truncation  error  for  Gaussian  convolution  is  infinitesimal,  and  the 
kernel  elements  at  the  truncation  radius  have  magnitude  roughly  10"*,  so  that  32 
bits    for    the   kernel   elements    is    not   unreasonable.     For    Laplacian-of-Gaussian 


Value 

t 

1. 

2. 

3. 

4. 

5. 

6. 

0.466065 

0.911070 

0.994608 

0.999873 

0.999999 

1.00000 

pto    pta 

LJ..AG<.',y)dxdy 

-0.660764 

-0.412275 

-5.30386e-02 

-2.14115e-03 

-2.973446-05 

-1.458216-07 

|G(f<T,0)| 

9.65324e-02 

2.153936-02 

1.76805e-03 

5.33905e-05 

5.93115e-07 

2.42393e-09 

AG(r(T,0) 

-9.65324e-02 

4.30786e-02 

1.23764e-02 

7.47468e-04 

1.364176-05 

8.241356-08 

Table  3.  Values  for  the  integrals  of  a  Gaussian  and  Laplacian-of-Gaussian  in  a  box 
of  increasing  size.  We  also  give  the  value  of  these  functions  at  the  truncation  loca- 
tion nearest  the  origin. 
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convolution,  a  radius  of  5.66a  will  give  roughly  one  part  in  10^  accuracy,  and  ker- 
nel elements  with  at  least  21  bits  should  be  used. 

Of  course,  high  accuracy  i;  an  issue  only  if  one  believes  that  it  is  important 
that  the  appropriate  kernel  (C'aKJS^an  or  Laplacian-of-Gaussian)  be  implemented 
accurately.  For  exainp'..  ,:  i^sinf  point  evaluates  for  the  kernel  elements  leads  to 
potentially  different  result?,  and  thus  is  not  necessarily  an  accurate  implementation. 
However,  it  might  happen  f  at  for  the  intended  application,  the  differences  are 
inconsequential,  and  the  Gaussian  or  Laplacian-of-Gaussian  can  be  replaced  by 
more  general  kernels. 

The  complexity  of  pointwise  convolutions,  without  making  use  of  separable 
kernels  or  other  decomposition  methods  (discussed  below),  is  a  simple  matter.  On 
a  uniprocessor,  the  convolution  of  an  N  hy  N  image  with  an  m  by  m  kernel  will  take 
time  O(m^N^),  or  O(m^)  time  psr  pkel.  A  pipeline  machine  will  frequently  have 
special-purpose  hardware  for  p&rformir.g,  say,  p  hy  q  convolutions  in  one  pass 
through  the  data.  In  this  case,  G(m^lpq)  passes  will  be  needed.  Finally,  on  a  bit- 
serial  mesh-connected  computer,  m^  multiply-accumulates  are  needed,  with  m^ 
shifts  of  the  image  data.  Each  multipiy-accumulate  takes  a  fixed  amount  of  time, 
depending  on  the  number  of  bits  ased  to  represent  the  kernel  elements  and  the 
number  of  bits  used  to  store  the  results.  Suppose  that  the  accumulates  contain  b 
bits.  Then  the  whole  process  is  0(m?b)  complexity. 

Rather  than  converting  these  asymptotic  complexities  into  typical  running  time 
estimates,  we  defer  further  calculations  until  the  end  of  the  next  section.  In  that 
section,  we  discuss  a  couple  of  computational  short-cuts  that  have  a  profound  effect 
on  both  the  asymptotic  complexities  and  the  expected  running  times. 

2.  Iterated  Discrete  Convolution 

Two  computational  tricks  permit  rapid  discrete  Gaussian  convolution.  The  first 
trick,  almost  always  used  in  practice,  takes  advantage  of  the  fact  that  the  Gaussian  is 
a  separable,  symmetric,  kernel.  This  has  the  consequence  that  a  two-dimensional 
convolution  can  be  performed  by  iterating  two  one-dimensional  symmetric  convolu- 
tions.  That  is,  when  the  kernel  K(i,j)  is  separable,  given  say  by 

KiiJ)  =  kiU)-k2U), 
then  the  convolution  can  be  performed  according  to  the  formula 

^^K{i-i',j-j')f{i',j')  =  2iti(/-z')     ^k2{j-j')f{i',j') 

»•'  y  «•'  L  / 

Thus  the  horizontal  convolution  against  k2  is  performed  first,  and  the  result  is  con- 
volved against  the  vertical  kernel  k\.  If  the  kernel  is  m  by  m,  then  this  observation 
reduces  the  complexity  from  O(m^)  per  pixel  to  0(m).  For  the  Gaussian  kernel, 
both  the  continuous  and  discrete  versions  are  separable.  Further  computational  sav- 
ings can  be  realized  on  a  sequential  machine  by  making  use  of  the  symmetry  of  the 
kernels.   Specifically,  the  one-dimension  convolutions 

^ki(i-i')fii') 
i' 

can  be  implemented  as 
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ki(O)fii)  +   ^kiii')(fU-i')  +/(/  +  /'))  , 

since  ifei(-/)  =  iti(/).  This  trick  saves  a  few  mukipliesv  but  is  generally  not  helpful 
on  a  mesh-connected  parallel  architecture.  In  Tr.bies  44.5  and  6  we  refer  to  the  use 
of  the  symmetry  and  separability  of  the  kernel  as  gfiwnpuiation  Method  A . 

The  second  trick  applicable  to  Gaussian  oSnvbhitidii'  makes  use  of  the  fact  that 
the  binomial  coefficients  form  a  rapid  approxiiiiadbrf  t6  a  normal  distribution.  It  is 
well-known  that  the  binomial  coefficients,  given 'by'^"""      ' 

n! 


approximate  a  normal  distribution 


n    ^ 


1 


(n-k)\k\ 


t^/2iT^ 


VStTo- 

where  x  =  k—nll,  and  a  =  (\/n/2).  Binoniial"  coefficients  are  most  easily  calcu- 
lated using  Pascal's  triangle,  and  lead  to  the  design  of  an  iterated  convolution  ker- 
nel, formed  as  follows.  The  initial  data  is  /(/),  i=  •  •  •  ,  —  1,0,1,  •  •  •  ,  (in  one 
dimension),  which  we  set  equal  to  /io(0-  We 'define  hsii),  for  N>Q,  by 

,    _        hN-i{i-l)+ihN-i{i)  +  hN-i{i  +  l) 
hsii)  =  A • 


It  follows  that 


^^^^^-^i^wik^Nh'"-'^- 


This  is  a  close  approximation  (for  large  N)  of  convolution  against  the  Gaussian  with 
a  =  y/N/2.    This  method  of  Gaussian  convolution  has  the  advantage  of  yielding  a 


(T  = 

1 

<T  =  2 

<j=A 

Without  Methods  A  or  B,  4<t  cutoff 

ops 

ac 

sz 

ops 

ac 

sz 

ops 

ac 

sz 

162 

81 

(9x9) 

578 

289 

(17x17)' 

2178 

1089 

(33x33) 

Method  A,  4<7  cutoff 

28 

20 

(9x9) 

52 

36 

(17x17) 

100 

68 

(33x33) 

Method  A,  6a  cutoff 

42 

28 

(13x13) 

76 

52 

(25x25) 

151 

104 

(51x51) 

Method  B 

12 

16 

(5x5) 

48 

64 

(17x17) 

192 

256 

(65x65) 

Table  4.  Serial  Random-Access  computer  model:  number  of  arithmetic  operations  (ops)  and 
number  of  memory  accesses  (ac)  per  image  pixel  point  for  Gaussian  convolution  using  various 
methods  of  computation  for  various  sizes  of  Gaussians.  The  effective  size  of  the  2-D  convo- 
lution is  given  in  the  size  (sz)  column.  Generally,  Method  B  (Binomial  coefficients)  is  useful 
only  for  small  a  on  this  model  of  machine. 
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finite  computation,  using  simple  arithmetic. 

In  two  dimensions,  a  similar  iterative  blurring  scheme  can  be  devised,  formed 
firom  the  composition  of  two  one-dimensional  blurring  steps  as  given  above.  Specif- 
ically, ho(ij)  is  the  mitisd^j^r.z^'  data,  and  hsiij)  is  obtained  from  hN-i(i,j)  by 
convolution  against  the  thtee-!fy-three  mask 

;  !  ;•  JL 

^       16 

Once  again,  hs{i,j)  will  approximate  a  convolution  of  the  initial  data  f{i,j)  against 
a  Gaussian  with  (t=VN/2.  In  Tables  4,  5,  6,  we  refer  to  this  method  of  Gaussian 
convolution  as  Method  B. 

For  a  serial  random  access  computer  (such  as  a  VAX),  making  use  of  the 
separability  of  symmetry  of  the  kernel  (Method  A)  results  in  a  large  savings  in 
terms  of  number  of  computations,  particularly  for  large  a.  The  iterative  approach 
(Method  B)  is^nly  useful  for  small  a.  where  the  approximation  to  a  Gaussian  will 
not  be  as  good.  Table  4  lists  computation  counts  using  a  serial  random-access 
machine  for  different  values  of  n,  with  computation  Method  A,  with  Method  B,  and 
without  either  method.  We  have  listed  arithmetic  operation  counts  separately  from 
accesses  to  image  memory,  since  different  machines  may  have  radically  different 
timings  for  these  two  kinds  of  operations.  However,  we  have  lumped  together  mul- 
tiplications and  additions  into  the  same  "arithmetic  count"  figure. 

For  a  pipeline  machine,  a  critical  issue  is  the  size  of  convolutions  that  can  be 
performed  in  a  single  pass.  We  assume  that  addition  and  scalar  multiplication  of 
two  images  requires  a  pass  through  the  entire  image,  but  many  systems  are  capable 
of  3  by  3  convolutions  or  larger  in  a  single  frame  time.  In  Table  5,  we  list  numbers 
of  passes  required  for  Gaussian  convolution,  considering  machines  capable  of  one 
by  one,  three  by  three,  and  seven  by  seven  convolutions  in  a  single  pass.  In  all 
cases,  we  assume  that  accumulative  writes  are  allowed,  i.e.,  that  the  result  of  a  local 
convolution  can  be  summed  with  an  existing  image  without  an  additional  pass. 
We've  also  assumed  that  scrolling  the  image  data  is  free,  since  the  shift  can  usually 
be  combined  with  another  operation  by  setting  offset  registers.  When  these 
assumptions  do  not  hold,  the  counts  for  number  of  passes  wiU  increase.  Specifi- 
cally, if  additive  writes  are  not  allowed,  then  operation  counts  will  increase  propor- 
tionately, since  each  multiply  must  be  followed  by  an  addition  pass.  If  image  scrol- 
ling requires  a  pass  (to  move  the  data),  then  costs  are  similar  to  costs  for  a  mesh- 
connected  computer,  treated  below,  where  movement  of  data  incurs  communication 
costs. 

We  see  that  on  a  pipeline  machine,  the  iterative  method  (Method  B)  is  espe- 
cially useful  when  three-by-three  convolutions  are  allowed,  as  long  as  a  is  not  too 
large.  If  the  hardware  permits  larger  single-pass  convolutions,  a  similar  iterative 
method  can  be  defined  where  each  iteration  takes  advantage  of  the  full  available 
convolution  size  capabilities. 

For  a  mesh-connected  parallel  computer,  the  situation  is  similar  to  the  pipeline 
computer  case,  except  that  now  the  number  of  bits  retained  in  each  convolution 
makes  a  difference.   Further,  there  may  be  large  differences  between  bit  operations 
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:    -T^o 

-.v  ■ 

rfi'l'-' 

'a«*2 

a=4 

Neither  A  nor  B,  4<t  cutoff 

l,  '-■{c 

•^  -     1*.  . 

1x1 

81 

289 

1089 

3x3 

18 

64 

121 

7x7 

8 

9 

25 

Method  A,  4<t  cutoff 

1x1 

18 

34 

66 

3x3 

6 

12 

22 

7x7 

4 

6 

10 

Method  B 

1x1 

12 

48 

192 

3x3                  . 

.2 

8 

32 

7*7 

2 

8 

32 

Table  5.  Number  of  passes  required  on  a  pipeline  machine  for  convolution  by  a 
Gaussian  using  various  methods  of  computation.  For  each  method,  we  list  figures 
for  three  different  architectures:  one  capable  of  only  image  sums  and  multiplies,  one 
capable  of  single-pass  3  by  3  convolutions,  and  one  capable  of  7  by  7  convolutions. 


on  the  processor  and  communication  costs  to  access  a  bit  on  a  neighboring  proces- 
sor. We  thus  count  arithmetic  bit  operations  separate  from  bit  communication 
counts  in  Table  6.  We  assume  that  the  image  data  is  given  with  8  bits  of  precision. 
For  Method  A,  we  assume  that  the  convolution  is  performed  with  b  bits  of  preci- 
sion. For  greatest  efficiency,  the  image  data  is  moved  among  the  processors,  and 
the  convolution  is  accumulated  in  place.  For  Method  B,  we  consider  both  the  case 
where  b  bits  are  used,  and  where  full  precision  is  retained  by  allowing  the  integers 
to  grow  as  large  as  needed. 

We  once  again  see  that  Method  B,  the  iterative  approach,  is  less  valuable  as  a 
increases.  This  is  because  the  number  of  iterations  required  increases  with  cr^. 
However,  since  the  binomial  coefficient  approximation  to  a  Gaussian  used  in 
Method  B  is  asymptotically  accurate  as  a-*^,  we  might  well  ask  whether  Method  B 
has  any  utihty.  One  answer  is  that  the  utility  depends  upon  the  architecture.  Cer- 
tain pipeline  machines  have  multiple  stages,  and  can  thus  implement  multiple  itera- 
tions rapidly.  Accordingly,  the  range  in  which  the  iterative  approach  is  favorable 
may  well  include  values  of  a  of  interest.  A  second  answer  is  addressed  in  Section 
5,  where  we  discuss  iterative  approaches  to  handling  borders. 

3.  Difference-of-Gaussians 

A  number  of  operations  are  based  on  difference-of-Gaussian  (DoG)  or 
Laplacian-of-Gaussian  (LoG)  operations.  The  Marr-Hildreth  edge  operator  is  one 
such;  another  is  the  DOLP   [5].    First,  it  should  be  noted  that  the  DoG  is  an 
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- 

ff= 

=  1 

<T  = 

=2 

<7= 

=  4 

Neither  A  nor  B, 

4<T  cutofv 
6a  cutoff 

pps 

com 

ops 

com 

ops 

com 

729b 
lj21b 

648 
1352 

2601b 
5225b 

2312 
5000 

9801b 
23409b 

8712 
20808 

Method  A 

4<T  cutoff 
6(7  cutoff 

162b 
234b 

144 
208 

306b 
450b 

272 
400 

594b 
918b 

528 
816 

Method  B 

b  bits  precision 
Full  precision 

8b 
72 

8b 

32b 
672 

32b 
608 

128b 
8832 

128b 
8376 

Table  6.  Number  of  bit-s'jj.'j-d  j.  XPtioas  (ops)  and  neighbor  bit  communication 
accesses  (com)  required  for  Gs^u^u.^  coiivoiutions  on  a  mesh-connected  parallel  ar- 
chitecture machine.  It  is  assumed  thsx  iihe  image  data  is  stored  as  8  bit  numbers,  one 
per  processor,  and  that  the  output  convolution  values  will  have  b  bits  of  precision 
(except  for  Method  b,  "Full  precision"  mode).  Typically,  b  will  depend  on  <t,  but 
will  generally  be  at  least  16. 


approximation  to  the  LoG,  in  the  sense  that 

AG(x,y)  =    lim        ^^^  ,  {G„^ix,y)-G„2(x,y)]. 

If  CTi  and  CT2  are  very  different,  then  the  approximation  is  not  very  good.  In  many 
schemes,  such  as  the  Burt  Laplacian  pyramid  [6],  the  effective  DoG  is  formed  with 
the  ratio  of  cti  and  0-2  fixed.  In  fact.  Mart  and  Hildreth  suggested  a  separation 
according  to  (o-i/a2)  =  1.6  .  This  suggestion  is  often  misinterpreted  as  a  statement 
that  the  LoG  is  best  approximated  by  a  DoG  with  CTi/a2  =  1.6,  which  is  clearly  not 
true. 

Image  2  shows  the  zero-crossings  at  a  particular  scale  for  the  difference-of- 
Gaussians  applied  to  the  image  showed  in  Image  1,  where  the  ratio  of  the  cr's, 
CTi/o-2  =  1.6,  while  Image  3  shows  the  zero-crossings  using  the  Laplacian  of  the 
Gaussian  using  scale  02-  For  Image  2,  we  used  ai  =  6.4  and  trj  =  4.0,  while  the 
LoG  example  in  Image  3  is  based  on  a  =  4.9.  In  this  example,  we  find  that  the 
DoG  provides  slightly  better  correlation  of  the  zero-crossings  with  edges  and  signifi- 
cant features  in  the  original  image.  We  might  also  compare  the  DoG  with  an  LoG 
with  a  different  intermediate  value  of  ct,  but  our  experience  with  such  comparisons 
is  similar  to  the  one  shown.  Namely,  we  find  that  the  location  and  density  of  zero- 
crossings  is  slightly  more  stable  when  a  DoG  is  used. 
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Image  1.  An  original  image,  digitized  to  512  by  512  pixels. 
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Image  2.  The  zero-crossings  of  the  Difference-of-Gaussian  filtered  version  of  the 
image  in  Image  1,  where  the  Gaussians  have  standard  deviation  a's  corresponding  to 
4  pixels  and  6.4  pixels. 
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Image  3.  The  zero-crossings  of  the  Laplacian-of-Gaussian  filtered  version  of  the  im- 
age in  Image  1,  where  the  Gaussian  has  standard  deviation  a  corresponding  to  4.9 
pixels. 


If  a  pyramid  scheme  is  desired,  then  the  Burt  method  is  by  far  the  one  of 
choice.  The  effective  ratio  of  scales,  0-2 /ai,  is  equal  to  2  in  the  standard  scheme 
where  pyramid  levels  decrease  in  size  by  a  factor  of  2  between  levels.  Other 
schemes,  however,  are  possible  [2,7].  If  a  DoG  is  desired  at  a  fixed  maximum 
resolution,  then  for  greatest  efficiency,  the  two  Gaussian-convolved  images  should 
be  computed,  and  some  care  should  be  exercised  in  computing  the  difference,  since 
numerical  precision  may  be  difficult. 

If  the  LoG  is  desired,  despite  advantages  of  the  DoG,  it  can  be  calculated  by 
any  of  the  following  four  methods: 

(1)  U^e  the  Laplacian-of-Gaussian  kernel,  averaged  in  blocks,  to  convolve 
against  the  data  f(i,j),  as  discussed  in  the  previous  section.  Note  that  the 
kernel  is  not  separable,  so  that  the  resulting  convolution  will  be  expensive. 

(2)  First  compute  a  Laplacian  of  the  image  data,  using  a  four-point  Laplacian, 
i.e., 

L(i,j)  =/(/-lJ)+/('  +  lJ)+/(M+l)+/(U-l)-4/(z,;), 

and  then  Gaussian-blur  the  result.    In  this  method,  numerical  precision  in 
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the  blurring  of  L(i,j)  is  a  major  concern,  since  if  insufficient  bits  for  the 
representation  are  provided,  the  result  of  any  substantive  amount  of  blurring 
will  be  zero. 

(3)  Compute  blurred  vetsi^ns  of  the  data  fUJ)  using  the  functions  hff(i,j)  as 
described  befors.'olfiiPiMkg  full  accuracy  by  retaining  all  necessary  bits.  The 
LoG  is  approximated  using  h\f+i—hN. 

(4)  Use  the  Huertas-Medioni  decomposition  of  the  Laplacian-of-Gaussian  into 
the  sum  of  two  geparable  kernels  [8].  Specifically,  the  decomposition  is 
given  by 


where 


AGU,y)  =  e-ki(x)h2(y)  +  C  ■h2ix)-hi(y), 

h^(x)=   p-jlj -'-'"'"'.   h2(y)  -  e-y"^'^\ 

Note  that  in  each  ot  die  two  separable  kernels,  one  of  the  two  dimensions 
uses  a  Gaussian  convolution.  The  other  convolution  in  each  of  the  two  ker- 
nels is  a  second  derivative  of  a  Gaussian  in  one  dimension.  The  decomposi- 
tion arises  since  ^(gix')giy)}  --^  g"(x)g(y)  +  g{x)g"{y). 

For  the  examples  in  Image  2  and  3,  we  used  double-precision  floating-point 
representations  of  the  image  and  kernel  elements,  for  maximum  accuracy.  Despite 
the  computational  advantages,  we  did  not  use  Methods  A  nor  B,  nor  any  other  spe- 
cial property  of  the  kernels.  We  used  extremely  large  kernels  to  essentially  elim- 
inate truncation  errors.  When  we  used  any  of  the  approximation  methods  or  com- 
putational speed-ups  discussed  above  on  a  limited-precision  processor,  the  results  of 
applying  the  zero-crossings  operation  were  frequently  disconcertingly  different. 
The  difficulty  is  that  the  location  of  the  zero-crossings  can  be  extremely  sensitive  to 
small  changes  in  the  filtered  images.  There  were  always  a  number  of  zero-crossings 
(with  high  gradients  in  the  filtered  data)  that  were  not  affected  by  the  approxima- 
tion methods,  but  many  of  the  zero-crossings  belonging  to  texture  edges  and  weak 
edges  were  considerably  changed. 

For  the  computational  costs  of  these  methods,  one  can  refer  to  the  appropriate 
cost  of  Gaussian  blurring.  Specifically,  Laplacian-of-Gaussian  convolution  by 
Method  (1)  above  is  equivalent  to  Gaussian  convolution  without  Method  A  or  B. 
The  appropriate  a  cutoff  for  LoG  convolution,  however,  will  be  larger  than  the  ct 
cutoff  for  Gaussian  convolution.  Methods  (2)  and  (3)  above  are  essentially  Gaus- 
sian convolution,  using  Methods  A  and  B  respectively.  Finally,  the  Huertas- 
Medioni  decomposition  transforms  LoG-convolution  into  the  sum  of  two  separable 
convolutions,  so  that  the  computational  times  are  essentially  given  by  those  for 
Method  A ,  except  that  the  figures  should  be  doubled  and  an  additional  sum  of  two 
images  will  be  required  for  the  output. 

4.  Zero-crossings 

To  compute  the  zero-crossings  of  a  function  of  two  variables,  h(i,j),  the  data 
should  be  thresholded  at  zero,  and  border  points  identified  in  the  resulting  binary 
image.  We  mention  two  possible  definitions  for  a  border  point.  A  border  point  can 
be  defined  as  a  pixel  that  contains  a  pixel  of  a  different  binary  value  among  its  four 
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(or  eight)  neighbors.  In  this  case,  edges  will  be  two  pixels  thick.  Alternatively,  a 
border  point  can  be  defined  as  a  "1"  pixel  with  a  "0"  pixel  among  its  neighbors. 
One  way  of  finding  border  pixels  using  the  second  definition,  appropriate  when  fast 
three-by-three  convolution  hardware  is  available /iis^t(9  blur  the  binary  image  using 
the  mask  with  I's  in  the  eight  neighbors  and  9  in  ^jCfin^eF; i 

1    1    1" 

1   9   t 

.1    1    1 

and  then  thresholding  the  result  so  that  values  9  and  above,  but  less  than  17,  are 
identified  as  border  pixels.   (Here  we  have  used  eight  pixel  neighborhoods.) 

Haralick  has  suggested  that  zero-crossings  of  the  second  directional  derivative, 
where  the  direction  of  differentiation  is  the  direction  of  the  local  gradient  [9],  will 
tend  to  more  accurately  locate  edges  as  compared  to  the  Marr-Hildreth  edge  opera- 
tor. It  should  be  noted  that  this  operator  also  yields  closed  contours  for  the  zero- 
crossings,  and  can  be  computed  by  finding  zero-crossings  of: 

•  ■  •  n^'  f.   : .  ■    ,-  '1    ; 
where  a  subscript  denotes  a  partial  derivative.  The  nonlinear  second  order  partial 

differential  operator  on  the  right  hand  side,  is  m  fact  the  directional  derivative  scaled 
by  the  square  of  the  gradient,  and  is  thus  defined  even  when  the  gradient  is  zero. 
Canny  [10]  notes  that  the  operator  can  be  formulated  as  (1/2)V/-V  |V/p.  We  used 
a  different  computation,  formulating  the  partial  derivatives  by  using  local  three-by- 
three  convolutions.   Specifically  we  have  used: 


/-i 

-1   0   1 

f'-j 

1       2     1 

-2  0  2 

» 

0      0     0 

9 

-1   0   1 

u 

-1    -2   1 

0     0     0 

-10    1  ■ 

0     1     0 

1    -2   1 

0     0     0 

'  fyy  ~ 

0-2  0 

0     0     0 

,10-1 

0     1     0 

fxx     ~ 


The  discretized  h(i,j)  is  a  sum  of  products  of  local  convolutions  of  image  data  as 
given  by  these  (or  related)  kernels.  The  image  data  can  be  the  original  image,  or  an 
appropriately  blurred  version  of  the  image  data.  Haralick  recommends  that  the  par- 
tial derivatives  of /be  computed  using  the  "facet  model,"  so  that  the  smoothing  is 
impUcit  in  the  representation  of  the  parameters  of  the  local  facet  for  /.  Other 
methods  of  blurring  can  be  used,  including  Gaussian  blur,  although  this  method  of 
computation  does  not  work  well  with  images  that  have  been  substantially  blurred  by 
a  Gaussian  due  to  the  small  magnitude  of  the  local  gradient  of  the  blurred  image. 
In  Image  4a,  we  show  the  zero-crossings  of  the  second  directional  derivative,  after 
the  image  has  been  only  slightly  blurred  ia~J).  This  should  be  compared  with 
zero-crossings  of  the  Laplacian-of-Gaussian  of  the  same  image  at  a  comparable  scale 
(Image  4b) . 

Whether  a  DoG,  LoG,  or  second  directional  derivative  is  used,  the  zero- 
crossings  do  not  necessarily  track  the  edges  accurately.  Prominent  edges  will  gen- 
erally have  a  zero-crossing  curve  associated  with  them  (at  some  scale  of  resolution). 


Page  17 


Computing  Large-Kernel  Convolutions  of  Images 


but  the  zero-crossings  will  tend  to  wander  off  of  the  edges  and  complete  loops  for 
which  the  main  edge  for    s  only  a  portion  of  the  loop.   Thus  it  is  desired  to  tag  pix- 

ssing  with  a  measure  of  the  edge  strength.  One  way  of 
'-^  I  -actiec,  is  to  measure  the  slope  of  the  filtered  image 
^«C^,es  usually  give  rise  to  zero-crossings  for  which 
:^^  Thus  a  measure  of  importance  is  given  by 
<i.  "'USj^e  AG*/(ac,y)  =  0.  This  idea,  for  example,  is 
I  "'Vx^e  slope  at  a  zero-crossing  can  be  computed  by 
i  -  uf  the  Laplacian-of-Gaussian  filtered  image.  We 
have  found  that  by  throwiiig  o»,i  >rf>'Crossing  points  where  the  Sobel  gradient  mag- 
nitude is  below  a  threshold,  the  r'-.iiiaining  curves  form  a  much  more  accurate  depic- 
tion of  the  edges  in  the  imagt  Tmage  5  shows  the  zero-crossings  filtered  by  the 
Sobel  gradient,  corresponding  to  r'i$  unfiltered  zero-crossings  shown  in  Image  3. 


els  identified  as  a  zero-^ 
doing  this,  used  frequsrt 
at  the  zero-crossing.    ''^' 
the  crossing  has  a  l 
|V(AG*/)|(x,y),atp 
mentioned  in  Marr's  h. 
taking  a  Sobel  gradient  j^ 


Image  4a.  The  zero-crossings  of  the  second  directional  derivative,  computed  in  the 
direction  of  the  gradient  (the  Haralick  operator).  In  regions  where  the  gradient  is 
zero,  the  operator  returns  zero,  and  thus  zero-crossings  do  not  occur,  since  the  bord- 
er points  are  defined  as  pixels  that  have  a  positive  value  with  a  nonpositive  value  in 
the  neighborhood.  The  original  image  was  slightly  blurred  by  a  Gaussian.  The  ef- 
fective (7  is  roughly  0.7  pixels. 
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Image  4b.  The  zero-crossings  of  a  LapIacian-of-Gaussian  convolution  of  the  image, 
taken  at  approximately  the  same  scale  of  resolution  as  Image  4a.  We  note  that  the 
Laplacian-of-Gaussian  tends  to  introduce  a  number  of  extraneous  zero-crossings,  and 
tends  to  be  slightiy  less  accurate  near  comers. 
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%^JS^ 


'  '■,  .  ■  .-J  '-■ 


Image  5.  The  zero-crossings  of  Image  3,  with  a  brightness  encoding  of  the  Sobel 
gradient  magnitude  of  the  Laplacian  of  the  Gaussian  of  the  image  data  at  those 
zero-crossings. 


Here,  we  have  simply  displayed  the  zero-crossings  with  the  intensity  proportional  to 
the  magnitude  of  the  Sobel  gradient  at  the  zero-crossing.  When  this  image  is  thres- 
holded,  the  result  is  curves  (which  are  no  longer  closed  contours)  with  some  of  the 
weak  edges  removed. 

Alternatively,  it  is  possible  to  "and"  the  results  of  finding  zero-crossing  con- 
tours at  several  successive  scales  of  resolution.  Providing  the  contours  are  thick- 
ened in  all  but  one  of  the  scales  before  the  "and"-ing,  the  result  again  is  a  collection 
of  curves,  not  necessarily  closed,  but  associated  with  prominent  edges.  Marr  and 
Hildreth  speculated  on  methods  for  combining  zero-crossings  from  multiple  chan- 
nels (scales),  and  conjunctive  combination  was  one  suggestion  [1].    In  essence. 
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"and"-ing  zero-crossings  at  successive  levels  demands  that  zero-crossings  not  move 
rapidly  as  the  scale  changes.  Rapid  movement  is  possible  only  if  the  gradient  at  the 
zero-crossing  is  small,  although  it  can  happen  Sji^^j!^^;^ossings  with  low  gradient 
also  survive  "and"-ing  at  several  levels. 

5.  Borders 

All  of  the  convolutions  discussed  so  fai!il  Jie  original  data  f{x,y)  is 

defined  for  {x,y)^TR}  and  that  the  grid  of  pi^ii,  j;is  an  infinite  lattice.    Gen- 

erally, however,  images  are  defined  on  a  finiteti'.^-.iiiiigular  grid  of  points,  or  some- 
times on  an  irregularly  shaped  grid.  Suppose;  it&at  / (1,7)  is  defined  for  (1,7)  €2?, 
where  I?  is  a  domain  of  grid  points.  We  define  the  boundary  points  of  the  domain 
as  points  in  V  with  at  least  one  of  the  eight  neiglibors  outside  of  P ,  and  denote 
those  points  by  dV .  How  should  the  infinite  convolutions  be  defined? 

A  usual  tactic  is  to  simply  extend  fX^J),  tOi^  ero  for  pixels  outside  of  V.  For 
Gaussian  convolution,  however,  .therc^ai-i  '  nate  approaches,  motivated  by 
regarding  Gaussian  convolution  as  a ^-iS^arji^.jiii  solving  the  Heat  equation  [12]. 
Specifically,  we  can  use  the  formulas 'fii^^iteJE^fVe  Islurring  in  conjunction  with  fixed 
boundary  data:  &;  |^i|€  vS'^  " 

16/i^+i(z,;)  =  hN(i-lJ-l)+pNii-l,j)+hNii-l,j  +  l) 

+hN(i  +  l,j-l)  +  2hNii  +  lJ)+hN(i  +  lJ  +  l) 

for  (iJ)^V-dV,  and  Hn+iUJ)  ==  hN(i,j)  for  (i,j)^dV.  Although  this  yields 
something  different  from  Gaussian  blurring,  the  result  is  an  appropriate  blurring  of 
the  image  data.  An  analog  to  difference-of-Gaussian  filters  can  then  be  computed. 
Note  that  such  difference  filters,  when  defined  this  way,  will  always  give  zero 
values  on  the  boundary. 

Another  possibility  for  handling  borders  is  to,  in  essence,  insist  that  the  normal 
derivative  at  the  borders  be  zero.  In  terms  of  the  discrete  blurring  kernels,  this 
condition  can  be  implemented  (for  a  convex  grid  of  points  V,  and  when  the  kernel  is 
no  bigger  than  three-by-three)  by  the  following  rule: 

Every  access  to  a  pixel  outside  of  the  grid  T>  should  substitute  for  that  pixel  the 
value  of  the  closest  pixel  in  the  grid  V. 

Thus,  for  example,  suppose  that  V  consists  of  the  rectangular  array  of  pixels 
{(ij)  I  0  S  /  s  511,  0  <  ;■  <  511}.  Then  to  update  a  pixel  on  the  left  edge,  (/,0), 
with  1  ^  I  ^  510  (not  a  corner  point),  the  formula  is 

hN+i(i,Q)  =  (1/16)-  [3M/-I.O)  +  hN(i-l,l) 

+  6hNU,0)  +  2hN(i,l)  +  3hN(i+l,0)  +  M/  + 1,1)1  . 

Similarly,  in  the  upper  left  corner,  the  update  rule  is 

hN+i{0,0)  =  (1/16)-  [qMO.O)  +  3hN(0,l)  +  3M1.0)  +  /i^(l,l)]  . 

For  such  a  rectangular  grid  D,  the  rule  can  be  seen  to  be  equivalent  to  first  blurring 
the  rows  by  the  formulas  using  the  masks 

Page  21 


Compating  Large-Kernel  CooTolntions  of  Images 


1/4   1/2   1/4      at  interior  points, 
[o      3/4   1/4]     at  left  edges, 
at  right  edges. 


and  then  blurring  the  ri^sultijii^.^a^mns  by  similar  masks  oriented  vertically. 

This  method  of  updating  has  the  property  that  the  global  mean  is  constant,  i.e., 
2  ^^Nii'j)  is  independ-ft  cf  A^.    Once  again,  differences  of  levels  can  be  formed 

i    J 

to  construct  analogs  to  the  DoG  and  LoG  structure.  Image  6a  shows  the  finite  rec- 
tangular image  of  Image  1  bluried  by  this  method.  This  blurring  can  be  compared 
to  the  image  that  results  if  the  original  image  is  blurred  on  an  infinite  domain,  and 
then  restricted  to  the  original  size  (Image  6b). 
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Image  6.  The  left  image  (Image  6a)  is  a  128  by  128  image  blurred  by  the  iterative 
method,  keeping  the  normal  derivative  at  the  borders  equal  to  zero.  The  average  in- 
tensity value  is  kept  constant  by  this  method.  ^  =  32  stages  of  blurring  were  used, 
corresponding  to  7  =  4  in  a  Gaussian  convolution.  On  the  right  (Image  6b),  the 
same  image  is  blurred  an  equivalent  amount  by  the  iterative  method,  assuming  that 
pixels  outside  of  the  image  domain  have  value  zero.  The  average  intensity  value 
drops  as  the  number  of  blurring  iterations  increases  when  using  this  method.  All 
difference  will  occur  within  32  pixels  of  the  border. 
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